NetNews Offline 2

Path: news.clark.net!not-for-mail
From: gusty@clark.net (Harlan Messinger)
Newsgroups: comp.lang.c++
Subject: Re: Help with float/double data types
Date: 31 Jan 1996 21:08:57 GMT
Organization: Clark Internet Services, Inc., Ellicott City, MD USA
Message-ID: <4eolp9$726@clarknet.clark.net>
References: <96030.105940RWL380B@MAINE.MAINE.EDU>
NNTP-Posting-Host: explorer.clark.net
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Newsreader: TIN [UNIX 1.3 950726BETA PL0]

RWL380B@MAINE.MAINE.EDU wrote:
: C++ Experts,
:
: If I declare X, Y, Z and Zstd as float:
:   float X=0.0,Y=0.0,Z=0.0,Zstd=0.0;
: Then the output looks correct but the digits after the decimal points
: in X are incorrect (original first line X = -2082085.85000). The
: first value of Y is incorrect, but the next values are correct!
:
:   X-Coordinate    Y-Coordinate     Z-Est  Z StdDev  n
: -2082085.87500  -2127847.00000  23.12121   1.77931  8
: -2082085.87500  -2047847.12500  23.13800   1.73910  8
: -2082085.87500  -1967847.12500  23.15039   1.69751  8
: -2082085.87500  -1887847.12500  23.15789   1.65532  8
: -2082085.87500  -1807847.12500  23.15990   1.61336  8
: -2082085.87500  -1727847.12500  32.19845   1.55000  8
: -2082085.87500  -1647847.12500  38.46202   1.49013  8
: -2082085.87500  -1567847.12500  32.17017   1.45502  8
: -2082085.87500  -1487847.12500  39.51803   1.40543  8
: -2082085.87500  -1407847.12500  47.27689   1.37395  8
:
: If I declare X,Y,Z and Zstd to be double then the output is
: correct.
:
:   X-Coordinate    Y-Coordinate     Z-Est  Z StdDev  n
: -2082085.85000  -2127847.12500  23.12121   1.77931  8
: -2082085.85000  -2047847.12500  23.13800   1.73910  8
: -2082085.85000  -1967847.12500  23.15039   1.69751  8
: -2082085.85000  -1887847.12500  23.15789   1.65532  8
: -2082085.85000  -1807847.12500  23.15990   1.61336  8
: -2082085.85000  -1727847.12500  32.19845   1.55000  8
: -2082085.85000  -1647847.12500  38.46202   1.49013  8
: -2082085.85000  -1567847.12500  32.17017   1.45502  8
: -2082085.85000  -1487847.12500  39.51803   1.40543  8
: -2082085.85000  -1407847.12500  47.27689   1.37395  8
:
: Coming from Fortran, this should not be happening.

Why not? The difference between real and double precision in Fortran is
about the same as between float and double in C++.

: Borland's
: documentation suggests that negative floats and doubles don't
: exist.

You are misreading it somehow.

: However, their web page has a technical document
: describing how floating points are represented and indicate
: that negative floats and doubles are OK. I am puzzled
: by the behavior I describe above. And yes, I need access to
: the X, Y and Z values as numbers for certain statistics I
: need to include so I can't just read them as strings.
: Any help is greatly Appreciated!
:

Floating point numbers are stored, at least on PCs, in binary scientific
notation. The sign of the number is stored in one bit. The absolute
value of the number is then expressed uniquely in the form s * 2^e
(s times 2 to the e power), where s >= 0.5 but s < 1, and e is an
integer that can be positive, negative or zero. The values s
(significand) and e (exponent) are stored in some set of bits. On a PC,
a 32-bit float uses one bit for the sign, 8 bits for the exponent and
the remaining 23 bits for the significand.
The significand is expressed as a binary fractional expansion; the first
digit is chopped off (it is always 1, since s is at least 1/2 but less
than 1, so to save space and allow greater precision the 1 is assumed
instead of being saved explicitly); and the next 23 binary places are
stored. The exponent e is stored in the 8 exponent bits as an unsigned
integer with a fixed offset (bias) added, not as a sign and a magnitude.

The number -2127847.12500 can be expressed in binary scientific notation
as

  (-).1[00000011101111110011100]1000... times 2 to the 22nd power.

The bits between brackets are the 23 bits that would be saved in a
float. What remains after the right bracket is the rounding error, which
is a 1 in the 25th position after the point--that is, 2 to the negative
25th power times 2 to the 22nd power = 2 to the negative 3rd power =
1/8 = 0.125, exactly the amount by which your printed output was off.

The number -2047847.12500 in binary scientific notation is

  (-).1[11110011111101100111001] times 2 to the 21st power, EXACTLY.

This value can be specified fully in the 24 binary places provided (23
stored plus one imputed), so no rounding error occurs. I haven't checked
your other numbers individually, but this gives you the general idea.

How many digits of precision does one get from 24 binary places (bits)?
The rightmost bit in the significand represents 2^(-24), which is about
6 times 10^(-8). If all the bits in the significand are 1s, then the
significand is just under 1. Therefore, the precision of the significand
is about 6 parts per 100,000,000, allowing about seven significant
figures in decimal representation.

Two of the 256 values of the 8-bit exponent field are reserved (one for
zero and denormalized numbers, one for infinities and NaNs), so in the
notation used here e can vary from -125 to 128, and the magnitude of
normalized floats can range from 0.5 * 2^(-125) to just under
1.0 * 2^128. This is equivalent to a range from about 10^(-38) to about
3 * 10^38 in decimal terms.

Doubles are stored in 64 bits on a PC: one bit for sign, 11 for
exponent, and 52 for significand. The 52 stored bits (53 with the
assumed leading 1) allow for about 15 significant digits. The 11 bits in
the exponent allow for orders of magnitude from 10^(-308) to 10^308.